ABSTRACT
In recent years, COVID-19 has impacted all aspects of human life. As a result, numerous publications relating to this disease have been issued. Due to the massive volume of publications, several retrieval systems have been developed to provide researchers with useful information. These systems rely heavily on lexical search methods, which raises many issues related to acronyms, synonyms, and rare keywords. In this paper, we present a hybrid relation retrieval system, CovRelex-SE, based on embeddings to provide high-quality search results. Our system can be accessed through the following URL: https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex-se/. © 2023 Association for Computational Linguistics.
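The contrast between lexical and embedding-based matching described above can be illustrated with a minimal sketch. This is not the CovRelex-SE implementation: the function names, the linear weighting `alpha`, and the toy vectors are assumptions made purely for illustration, and a real system would obtain its vectors from a trained embedding model.

```python
import math

def lexical_score(query: str, doc: str) -> float:
    """Fraction of query tokens that literally appear in the document."""
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def hybrid_score(query: str, doc: str,
                 q_vec: list[float], d_vec: list[float],
                 alpha: float = 0.5) -> float:
    """Hypothetical hybrid score: weighted sum of lexical and embedding similarity."""
    return alpha * lexical_score(query, doc) + (1 - alpha) * cosine(q_vec, d_vec)

# Toy vectors (made up): a synonym pair shares no tokens but has close embeddings.
score = hybrid_score("covid", "coronavirus disease", [1.0, 0.0], [1.0, 0.0])
```

In this toy case the lexical score is zero because the query and document share no tokens, yet the embedding term still contributes, which is the kind of synonym mismatch that purely lexical search struggles with.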
ABSTRACT
The COVID-19 crisis has led to an outburst of information that needs to be organized, validated, and made available to those seeking it. Despite the rapid growth and success of BERT models in the last 3 years, COVID QA remains a difficult task due to the lack of applicable datasets and of a relevant language representation. Therefore, this study proposes a transformer-based Question Answering (QA) model for COVID-19 questions from the biomedical domain. Further, it explores several datasets and models required for question type prediction, no-answer prediction, and answer extraction, along with transfer learning strategies. The study demonstrates that the exact match score can be significantly improved with limited amounts of training data from the biomedical domain. Finally, the findings are summarized as the Factoid QA Finetuning Framework (FQFF), which can provide initial direction for domain-specific QA tasks with a limited amount of data. © 2023 IEEE.
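For readers unfamiliar with the exact match score mentioned above, the following is a minimal sketch of how it is commonly computed in extractive QA. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the abstract does not state which normalization the study uses, and the function names are illustrative.

```python
import re
import string

def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold_answers: list[str]) -> int:
    """1 if the normalized prediction equals any normalized gold answer, else 0."""
    pred = normalize_answer(prediction)
    return int(any(pred == normalize_answer(gold) for gold in gold_answers))

def exact_match_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of questions whose prediction exactly matches a gold answer."""
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under this assumed normalization, a prediction such as "The ACE2 receptor." counts as a match for the gold answer "ACE2 receptor", while any difference in content words counts as a miss.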
ABSTRACT
Question Answering based on Knowledge Graphs (KG) has emerged as a popular research area in the general domain. However, few works focus on COVID-19 KG-based question answering, which is very valuable for the biomedical domain. In addition, existing question answering methods rely on knowledge embedding models to represent knowledge (i.e., entities and questions), but the relations between entities are neglected. In this paper, we construct a COVID-19 knowledge graph and propose an end-to-end knowledge graph question answering approach that utilizes relation information to improve performance. Experimental results show the effectiveness of our approach on COVID-19 knowledge graph question answering. Our code and data are available at https://github.com/CHNcreater/COVID-19-KGQA. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
ABSTRACT
This paper presents CovRelex, a scientific paper retrieval system targeting entities and relations via relation extraction on COVID-19 scientific papers. This work aims to build a system that helps users efficiently acquire knowledge from the huge number of rapidly published COVID-19 scientific papers. Our system can be accessed via https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex/.
ABSTRACT
Social media contains unfiltered and unique information, which is potentially of great value, but, in the case of misinformation, can also do great harm. With regard to biomedical topics, false information can be particularly dangerous. Methods of automatic fact-checking and fake news detection address this problem, but have not yet been applied to the biomedical domain in social media. We aim to fill this research gap and annotate a corpus of 1200 tweets for implicit and explicit biomedical claims (the latter also with span annotations for the claim phrase). With this corpus, which we sample to be related to COVID-19, measles, cystic fibrosis, and depression, we develop baseline models that automatically detect tweets containing a claim. Our analyses reveal that biomedical tweets are densely populated with claims (45% in a corpus sampled to contain 1200 tweets focused on the domains mentioned above). Baseline classification experiments with embedding-based classifiers and BERT-based transfer learning demonstrate that detection is challenging but achieves acceptable performance for the identification of explicit expressions of claims. Implicit claim tweets are more challenging to detect. © 2021 Association for Computational Linguistics